SPECIFICATION

GLOBAL SIGNAL DISTRIBUTION ARCHITECTURE 
IN A FIELD PROGRAMMABLE GATE ARRAY 

BACKGROUND OF THE INVENTION 

1. Field Of The Invention 

The present invention relates to global signal distribution architectures. More
particularly, the present invention relates to a global signal distribution architecture in a
field programmable gate array (FPGA) that is reprogrammable.

2. The Prior Art 

For an FPGA with a capacity of up to a million gates to gain acceptance, it is crucial
that high speed, high fanout nets be implemented in the FPGA design. High fanout nets in an
FPGA are typically separated into four categories: global utilities, local clock/set/reset,
control signals and high fanout data.

Examples of global utilities are clock, set or reset signals that define the main 
clock domains in the device by reaching many of the flip-flops in the FPGA. The local 
clock/set/reset signals, though they may have a medium to high fanout, typically occur
infrequently. 

Well known examples of control signals include flip-flop enable and multiplexer
select. However, a control function can be more generally described as being orthogonal
to the data flow in the design. Such signals have a medium to high fanout and may
occur frequently in a design. Another important characteristic of a control signal is that
the source of the control signal may originate in a different logic component, such that
the control signal driver may not be situated in the same physical hierarchy as the load
of the control signal driver. The category of high fanout data comprises those high fanout
signals that do not qualify as control signals.

In the design of high speed fanout nets, the requirements of each of these four
categories of high fanout signals must be taken into consideration. Accordingly, it is an
object of the present invention to meet the requirements of all four categories of high
fanout signals, while maintaining flexibility in the FPGA.

BRIEF DESCRIPTION OF THE INVENTION
According to the present invention, global utility signals that have their origin 
from input pins to the FPGA or from internal signals in the FPGA are distributed by a 
global distribution architecture to entities in the FPGA hierarchy known as clusters. The 
global distribution architecture includes I/O blocks that have I/O pins, buffers, boundary 
scan registers, interconnect to delay locked loops, and a global I/O (GIO) routing 
channel that includes global interconnect conductors which are coupled to the outputs 
of global multiplexers.

A global signal may be provided through a global multiplexer on the GIO routing
channel from any of four locations, namely, an input buffer, the output of a boundary
scan register (BSR), the output of a delay locked loop (DLL), and an internal FPGA
signal. The GIO routing channels from
the I/O blocks at both the top and bottom of the FPGA architecture are coupled into a
B16x16 tile of the FPGA architecture to form a global signal bus (GSB) in each of the
B16x16 tiles of the FPGA architecture.

G3 routing channels run adjacent the GSB. The G3 routing channels can be 
coupled to selected interconnect conductors in the expressway routing channels M3. 

As the G3 routing channels and the GSB traverse a B16x16 tile, they pass
through G3MAT switch matrices that provide access to B4x4 tiles in the FPGA
architecture. The G3 routing channels and the GSB are also coupled to user SRAM
modules associated with a B16x16 tile.

A G3MAT switching matrix switches the signals on the GSB and a G3 routing
channel onto a B4x4 utility routing channel. The GSB has GENERAL
PURPOSE/ENABLE interconnect conductors, SET interconnect conductors, and 
CLOCK interconnect conductors. 

At the lowest level in the FPGA architecture, the utilities from a G3MAT switch 
matrix are coupled to the inputs of four 8-input multiplexers labelled G, E, C, and S that 
select from the GENERAL PURPOSE signals, the ENABLE signals, the CLOCK signals 
and the SET signals. The outputs of multiplexers G, E, C, and S are coupled into each of 
the four clusters in the lowest level of the FPGA architecture on signal lines G, C, S, and 
E that correspond to the multiplexers labelled G, E, C, and S. 

BRIEF DESCRIPTION OF THE DRAWINGS 
FIG. 1 is a block diagram of a semi-hierarchical FPGA architecture with the global
signal distribution architecture according to the present invention.

FIG. 2 is a block diagram of a B16x16 tile in an FPGA and the associated
routing resources in the middle level of semi-hierarchical architecture according to the 
present invention. 



FIG. 3 is a block diagram of a B2x2 tile in an FPGA and the connection of the 
routing resources in the lowest level to the middle level of a semi-hierarchical 
architecture according to the present invention. 

FIG. 4 is a block diagram of a B2x2 tile in an FPGA and the routing resources in 
the lowest level of a semi-hierarchical architecture according to the present invention. 

FIG. 5 is a schematic diagram of the I/O block depicted in FIG. 1 according to the
present invention.

FIG. 6 illustrates the distribution of global signals to various circuits in a
B16x16 tile by a global distribution bus according to the present invention.

FIG. 7 illustrates a switching matrix for a global signal bus and a G3 routing 
channel according to the present invention. 

FIG. 8 illustrates the connection of a global signal bus and G3 routing channels to
user SRAM blocks according to the present invention.

FIG. 9 illustrates the connection of utility signals from a switching matrix to the
clusters in a B1 block according to the present invention.

DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT 
Those of ordinary skill in the art will realize that the following description of the 
present invention is illustrative only and not in any way limiting. Other embodiments of 
the invention will readily suggest themselves to such skilled persons. 

The present invention is implemented in a semi-hierarchical FPGA architecture
having top, middle and low levels. According to the present invention, signals with high
fanout may be distributed into the lowest level in the semi-hierarchical FPGA as global
utility signals that have their origin from input pins to the FPGA at the highest level in
the semi-hierarchical architecture or from internal signals in the FPGA. To better
understand the present invention, a description of the three levels of the
semi-hierarchical architecture is made herein.

Turning now to FIG. 1, a block diagram of the top level of a semi-hierarchical
architecture FPGA 10 with a global signal distribution architecture according to the
present invention is illustrated. The top level of the architecture is an array of the
B16x16 tiles 12 arranged in a rectangular array. A B16x16 tile 12 is a sixteen by sixteen
array of B1 blocks. As will be described in detail below, a B16x16 tile 12 and its
associated routing resources represent the middle level in the semi-hierarchical
architecture, and a B1 block and its associated routing resources represent the lowest
level in the semi-hierarchical architecture.

According to the present invention, the B16x16 tiles 12 are enclosed by I/O
blocks 14 on the periphery. Each of the I/O blocks 14 includes I/O pins, buffers, and
boundary scan registers. Further, the I/O blocks 14 on the top and the bottom of
the FPGA 10 include global I/O (GIO) routing channels 16-1 and 16-2, respectively, that
each preferably include sixteen interconnect conductors. The GIO routing channels 16 are
coupled into each of the B16x16 tiles 12 to form a 32-bit global signal bus (GSB) 18 in
each of the B16x16 tiles 12. The I/O blocks 14 on the top and the bottom of the FPGA
10, a GIO routing channel 16, and a GSB 18 will be described in greater detail below.

On each of the four sides of a B16x16 tile 12, and also associated with each of the
I/O blocks 14, is a freeway routing channel 20. It should be appreciated that on each side
of a B16x16 tile 12 there are two freeway routing channels 20, either as a result of the
disposition of two freeway routing channels 20 between adjacent B16x16 tiles 12 or as
a result of the disposition of two freeway routing channels 20 between a B16x16 tile 12
and an adjacent I/O block 14.

It should be appreciated that the number of B16x16 tiles 12 in the FPGA 10 may
be fewer or greater than the four shown in FIG. 1. The width of a freeway routing
channel 20 in the FPGA 10 can be changed to accommodate different numbers of
B16x16 tiles 12 without disturbing the internal structure of the B16x16 tiles 12. In this
manner, the floorplan of the FPGA 10 can readily be custom sized by including the
desired number of B16x16 tiles 12 in the design.

The freeway routing channels 20 can be extended in any combination of
directions at each end by a freeway turn matrix (F-turn) 22. An F-turn 22 is an active
device that includes tri-state buffers and a matrix of reprogrammable switches. The
reprogrammable switches are preferably SRAM pass devices. The interconnect
conductors in the freeway routing channels 20 that are fed into an F-turn 22 may be
coupled to many of the other interconnect conductors in the freeway routing channels
20 that come into the F-turn 22 by the reprogrammable switches. Further, the
interconnect conductors in the freeway routing channels 20 that are fed into an F-turn
22 continue in the same direction through the F-turn 22, even though the interconnect
conductors are coupled to other interconnect conductors by the reprogrammable
switches. A description of the implementation of an F-turn 22 is beyond the scope of
this disclosure and will not be made herein to avoid overcomplicating the disclosure and
thereby obscuring the present invention.

The freeway routing channels 20 along with the F-turns 22 form a coarse mesh. A
freeway routing channel 20 will very rarely be utilized all by itself without any
extension, since such distances are abundantly covered by the routing resources in the
middle hierarchy to be described below. A freeway routing channel 20 is primarily
intended to be used in conjunction with one or more other freeway routing channels 20
in any direction that together can span a distance of two or more B16x16 tiles 12.

The freeway routing channels 20 are hardwired to selected ones of the
interconnect conductors in the GIO routing channels 16 by global-freeway (G-F) turns
24. This provides an additional path for routing signals between the freeway routing
channels 20 and the GIO routing channels 16. The selection of coupling a freeway
routing channel 20 to one of the hardwired connections in the G-F turn 24 is made by
the F-turn 22 that is adjacent the G-F turn 24.

In FIG. 2, a block diagram of a B16x16 tile 12 and the associated routing
resources in the middle level of hierarchy is illustrated. The B16x16 tile 12 is a sixteen
by sixteen array of B1 blocks 30. To avoid overcomplicating the drawing figure, only
the B1 blocks 30 in a single row and a single column are indicated by the reference
numeral 30. The B16x16 tile 12 is based on the repetition and nesting of smaller
groupings (tiles) of B1 blocks 30. The smallest tile that is directly replicated and stepped
is a B2x2 tile 32 that includes a two by two array of four B1 blocks 30. The B2x2 tiles
32 are stepped into a four by four array of sixteen B1 blocks 30 in a B4x4 tile 34, and
the B4x4 tiles 34 are stepped into an eight by eight array of sixty-four B1 blocks 30 in a
B8x8 tile 36. A B16x16 tile 12 includes four B8x8 tiles 36.
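
The nesting described above can be checked with a little arithmetic. The following is a minimal sketch in Python, not part of the disclosed architecture; the names TILE_SIDE, b1_blocks and subtiles are hypothetical and chosen only for illustration.

```python
# Side length, in B1 blocks, of each tile type named in the description.
TILE_SIDE = {"B1": 1, "B2x2": 2, "B4x4": 4, "B8x8": 8, "B16x16": 16}


def b1_blocks(tile: str) -> int:
    """Number of B1 blocks contained in one tile of the given type."""
    side = TILE_SIDE[tile]
    return side * side


def subtiles(parent: str, child: str) -> int:
    """Number of `child` tiles that fit inside one `parent` tile."""
    return b1_blocks(parent) // b1_blocks(child)


if __name__ == "__main__":
    for name in TILE_SIDE:
        print(name, b1_blocks(name))
    print("B8x8 tiles per B16x16 tile:", subtiles("B16x16", "B8x8"))
```

Each step of the hierarchy doubles the side of the tile, so every tile contains exactly four tiles of the next smaller type.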

Though not depicted in FIG. 2, the B16x16 tile 12 further includes a block of user
assignable static random access memory (SRAM) disposed between the two upper
B8x8 tiles 36, and a block of SRAM disposed between the two lower B8x8 tiles 36.
The SRAM blocks will be described in greater detail
below.

The routing resources in the middle level of hierarchy are termed expressway
routing channels. There are three types of expressway routing channels, namely M1,
M2, and M3. In FIG. 2, only a single row and a single column of expressway routing
channels M1, M2, and M3 are denominated to avoid overcomplicating the drawing
figure. In a preferred embodiment of the present invention, there is a single group of
nine interconnect conductors in an M1 expressway routing channel, two groups of nine
interconnect conductors in an M2 expressway routing channel, and six groups of nine
interconnect conductors in an M3 expressway routing channel.

The expressway routing channels M1, M2, and M3 are segmented so that each
expressway routing channel M1, M2, and M3 spans a distance of a B2x2 tile 32, a B4x4
tile 34, and a B8x8 tile 36, respectively. Between each of the segments in the
expressway routing channels M1, M2, and M3 are disposed extensions that can extend
the expressway routing channel M1, M2, or M3 an identical distance along the same
direction.
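
The channel widths and segment spans of the preferred embodiment can be tabulated as follows. This is an illustrative Python sketch only; the class ExpresswayChannel, the table CHANNELS and the method segments_across are hypothetical names, and the segment count assumes that segments are laid end to end across the sixteen B1 blocks of a B16x16 tile.

```python
from dataclasses import dataclass

CONDUCTORS_PER_GROUP = 9  # each group holds nine interconnect conductors


@dataclass(frozen=True)
class ExpresswayChannel:
    name: str
    groups: int   # groups of nine conductors in the channel
    span_b1: int  # segment length, in B1 blocks along one direction

    @property
    def conductors(self) -> int:
        """Total interconnect conductors in the channel."""
        return self.groups * CONDUCTORS_PER_GROUP

    def segments_across(self, tile_side: int = 16) -> int:
        """Segments needed to cross a tile (assumes end-to-end segments)."""
        return tile_side // self.span_b1


# M1 spans a B2x2 tile, M2 a B4x4 tile, M3 a B8x8 tile.
CHANNELS = {
    "M1": ExpresswayChannel("M1", groups=1, span_b1=2),
    "M2": ExpresswayChannel("M2", groups=2, span_b1=4),
    "M3": ExpresswayChannel("M3", groups=6, span_b1=8),
}

if __name__ == "__main__":
    for ch in CHANNELS.values():
        print(ch.name, ch.conductors, ch.segments_across())
```

Under these assumptions an M3 channel carries fifty-four conductors and crosses a B16x16 tile in two segments, which is why an extension is needed at the boundary between B8x8 tiles.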

The extensions 38 that couple the segments in the expressway routing channels
M1 and M2 are passive reprogrammable elements that are preferably SRAM pass
devices. The extensions 38 provide a one-to-one coupling between the interconnect
conductors of the expressway routing channels M1 and M2 on either side of the
extensions 38. To avoid overcomplicating the drawing figure, only the extensions 38 in
a single row and a single column are indicated by the reference numeral 38.

The segments of an M3 expressway routing channel are extended at the
boundary of a B16x16 tile 12 where an expressway routing channel M3 crosses a
freeway routing channel 20 by a freeway tab (F-tab) 40, and otherwise by an M3
extension 42. To avoid overcomplicating the drawing figure, only the F-tabs 40 and
M3 extensions 42 in a single row and a single column are indicated by the reference
numerals 40 and 42, respectively.



An F-tab 40 is an active device that includes tri-state buffers and a matrix of
reprogrammable switches. The reprogrammable switches are preferably SRAM pass
devices. The interconnect conductors in the freeway routing channels 20 and the
expressway routing channel M3 that are fed into an F-tab 40 may be coupled by the
reprogrammable switches to many of the other interconnect conductors in the freeway
routing channels 20 and the expressway routing channel M3 that come into the F-tab
40. A description of the implementation of an F-tab 40 is beyond the scope of this
disclosure and will not be made herein to avoid overcomplicating the disclosure and
thereby obscuring the present invention.

Further, the interconnect conductors in the freeway routing channels 20 and the
expressway routing channel M3 that are fed into an F-tab 40 continue in the same
direction through the F-tab 40, even though the interconnect conductors are coupled
to other interconnect conductors by the reprogrammable switches. When the F-tabs 40
are disposed between adjacent B16x16 tiles 12, an expressway routing channel M3
continues on to another expressway routing channel M3. According to the present
invention, as will be described in greater detail below, when the F-tabs 40 are disposed
between a B16x16 tile 12 and an I/O block 14 on either the top or the bottom of the FPGA
10, an expressway routing channel M3 is coupled into the I/O block 14.

Accordingly, an F-tab 40 implements the dual role of providing an extension of
the middle level routing resources in a B16x16 tile 12 to the middle level routing
resources in an adjacent B16x16 tile 12 or an I/O block 14, and providing access
between the middle level routing resources of a B16x16 tile 12 and a freeway routing
channel 20 in the highest level of the architecture. An F-tab 40 can combine the two
roles of access and extension simultaneously in the formation of a single net.

An M3 extension 42 is an active device that includes tristatable buffers coupled
to a matrix of reprogrammable switches. The reprogrammable switches are preferably
SRAM pass devices. The interconnect conductors in the expressway routing channel
M3 that are fed into an M3 extension 42 may be coupled by the reprogrammable
switches to many of the other interconnect conductors in the expressway routing
channel M3 that come into the M3 extension 42. A description of the implementation
of an M3 extension 42 is beyond the scope of this disclosure and will not be made
herein to avoid overcomplicating the disclosure and thereby obscuring the present
invention.

As depicted in FIG. 2, all of the expressway routing channels M1, M2, and M3
run both vertically through every column and horizontally through every row of B2x2
tiles 32. At the intersections of each of the expressway routing channels M1, M2, and
M3 in the horizontal direction with the expressway routing channels M1, M2 and M3 in
the vertical direction is disposed an expressway turn (E-turn) 44. To avoid
overcomplicating the drawing figure, only the E-turns 44 disposed in the B2x2 tiles 32
in a single row and a single column are indicated by the reference numeral 44. A
description of the implementation of an E-turn 44 is beyond the scope of this disclosure
and will not be made herein to avoid overcomplicating the disclosure and thereby
obscuring the present invention.

At the lowest level of the semi-hierarchical FPGA architecture, there are three
types of routing resources: block connect (BC) routing channels, local mesh (LM)
routing channels, and direct connect (DC) interconnect conductors. According to a
preferred embodiment of the present invention, there are nine interconnect conductors
in each BC routing channel and six interconnect conductors in each LM routing
channel. Of these three, the BC routing channels serve the dual purpose of being able
to both couple B1 blocks 30 together at the lowest level in the architecture, and also
provide access to the expressway routing channels M1, M2, and M3 in the middle level
of the architecture. In FIG. 3 aspects of the BC routing channels will be described, and
in FIG. 4 aspects of the LM routing channels and the DC interconnect conductors will
be described.

Turning now to FIG. 3, a B2x2 tile 32 including four B1 blocks 30 is illustrated.
Associated with each of the B1 blocks 30 is a horizontal BC routing channel 50-1 and a
vertical BC routing channel 50-2. Each horizontal BC routing channel 50-1 and vertical
BC routing channel 50-2 is coupled to an expressway tab (E-tab) 52 to provide access
for each B1 block 30 to the vertical and horizontal expressway routing channels M1,
M2, and M3, respectively.

An E-tab 52 is an active device that includes tri-state buffers and a matrix of
reprogrammable switches. The reprogrammable switches are preferably SRAM pass
devices. The interconnect conductors in the BC routing channels 50 and the
expressway routing channels M1, M2, and M3 that are fed into an E-tab 52 may be
coupled by the reprogrammable switches to many of the other interconnect conductors in
the expressway routing channels M1, M2, and M3 that come into the E-tab 52. Further,
the expressway routing channels M1, M2, and M3 that are fed into an E-tab 52
continue in the same direction through the E-tab 52, even though the interconnect
conductors are coupled to other interconnect conductors by the reprogrammable
switches. A description of the implementation of an E-tab 52 is beyond the scope of this
disclosure and will not be made herein to avoid overcomplicating the disclosure and
thereby obscuring the present invention.

At the E-tabs 52, the signals provided on the BC routing channels 50 can
connect to any of the expressway routing channels M1, M2, or M3. Once a signal
emanating from a B1 block 30 has been placed on an expressway routing channel M1,
M2 or M3 and traversed a selected distance, an E-tab 52 is employed to direct that
signal onto a horizontal or vertical BC routing channel 50-1 or 50-2 into a B1 block 30
at a selected distance from the B1 block 30 from which the signal originated. As the
connection between the routing resources at the lowest level in the architecture and the
routing resources in the middle level of the architecture, the E-tabs 52 provide that the
place and route of signals both inside and outside the B1 blocks 30 may be implemented
independently from one another.

In FIG. 4, the expressway routing channels M1, M2, and M3 and the E-turn 44
have been omitted for clarity. As further depicted in FIG. 4, in addition to the horizontal
and vertical BC routing channels 50-1 and 50-2 associated with each B1 block 30, there
are also associated with each B1 block 30 four LM routing channels 54-1 through 54-4
and first and second DC interconnect conductors 56-1 and 56-2. The BC routing
channels 50, the LM routing channels 54, and the DC interconnect conductors 56
provide significantly better performance than a strict hierarchy, and further help avoid
congesting the expressway routing channels M1, M2, and M3. The BC routing
channels 50 and the LM routing channels 54 combine to form two meshes. One is a
mesh connection within a B1 block 30, and a second is a mesh connection between B1
blocks 30.

The BC routing channels 50 provide portions of the two meshes. In the portion
of the mesh providing connection between adjacent B1 blocks 30, each horizontal and
vertical BC routing channel 50-1 and 50-2 shares an E-tab 52 with a horizontal or
vertical BC routing channel 50-1 and 50-2 in an adjacent B1 block 30 that may be
employed to couple a signal between adjacent B1 blocks 30 in a first direction. Further,
each horizontal and vertical BC routing channel 50-1 and 50-2 shares a BC extension
58 with a horizontal or vertical BC routing channel 50-1 and 50-2 in an adjacent B1
block 30 that may be employed to couple a signal between adjacent B1 blocks 30 in a
second direction. The BC extensions 58 provide a one-to-one coupling between the
interconnect conductors of the BC routing channels 50 on either side of the BC
extensions 58. Accordingly, each BC routing channel 50 in the horizontal and vertical
directions is coupled to the adjacent B1 blocks 30 in the corresponding horizontal and
vertical directions by an E-tab 52 in a first direction along both the horizontal and vertical
and in a second direction along both the horizontal and vertical by a BC extension 58.

From FIG. 4, it should be appreciated that the LM routing channels 54-1
through 54-4 pass through the B1 block 30 as two vertical LM routing channels 54-1
and 54-4 and two horizontal LM routing channels 54-2 and 54-3, and that the
intersections 60 of the vertical and horizontal LM routing channels 54 are hardwired
along a diagonal.

The LM routing channels 54 also provide portions of the two meshes. In the
portion of the mesh formed along with BC routing channels between B1 blocks 30, each
of the four LM routing channels 54-1 through 54-4 in each B1 block 30 shares an LM
extension 62 with an LM routing channel 54-1 through 54-4 in an adjacent B1 block 30
in either the corresponding horizontal or vertical direction that may be employed to
couple a signal between adjacent B1 blocks 30 in either the horizontal or vertical
direction. The LM extensions 62 provide a one-to-one coupling between the
interconnect conductors of the LM routing channels 54 on either side of the LM
extensions 62. Accordingly, between adjacent B1 blocks 30 there are two LM routing
channels 54 from each of the adjacent B1 blocks coupled by an LM extension 62 on all
sides of adjacent B1 blocks 30.

WO 00/49718    PCT/US00/04477

The DC interconnect conductors 56-1 and 56-2 form a high performance direct
connection between the logic elements in adjacent B1 blocks 30 to implement data path
functions such as counters, comparators, adders and multipliers. As will be described
below, each B1 block 30 includes four clusters of logic elements. Preferably, each of the
four clusters includes two three-input look-up tables (LUT3), a single two-input look-up
table (LUT2), and a D-type flip-flop (DFF). In the DC interconnect conductor routing
path, each of the DC interconnect conductors 56-1 and 56-2 is multiplexed to an input
to a separate one of the two LUT3s in each of the four clusters of a B1 block 30. The DC
interconnect conductors 56-1 and 56-2 are connected between vertically adjacent B1
blocks 30 as is illustrated in FIG. 4.
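
The cluster composition stated above fixes the logic resources available at each level of the hierarchy. The following Python sketch simply multiplies out the stated counts; the names CLUSTERS_PER_B1, CLUSTER_CONTENTS and logic_resources are hypothetical, and the figures follow only from the preferred embodiment described here.

```python
# Preferred embodiment: four clusters per B1 block, each with
# two LUT3s, one LUT2 and one D-type flip-flop.
CLUSTERS_PER_B1 = 4
CLUSTER_CONTENTS = {"LUT3": 2, "LUT2": 1, "DFF": 1}


def logic_resources(b1_block_count: int) -> dict:
    """Total logic elements in the given number of B1 blocks."""
    return {
        element: per_cluster * CLUSTERS_PER_B1 * b1_block_count
        for element, per_cluster in CLUSTER_CONTENTS.items()
    }


if __name__ == "__main__":
    print("one B1 block:", logic_resources(1))
    print("one B16x16 tile:", logic_resources(16 * 16))
```

For a single B16x16 tile of 256 B1 blocks this gives 1024 clusters, and therefore 1024 flip-flops that a global clock, set or enable signal may need to reach.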

Turning now to FIG. 5, a portion of an I/O block 14, including a GIO routing
channel 16, is depicted in greater detail. The portion of the I/O block 14 is shown above
a B2x2 tile 32 that is disposed along the edge of a B16x16 tile 12. In FIG. 2, a B2x2 tile
32 disposed along the edge of a B16x16 tile 12 is illustrated with an expressway routing
channel M3 coupled to an F-tab 40. The interconnect conductors in the vertical BC
routing channels 50 and the vertical LM routing channels 54 in the B2x2 tile 32 are also
depicted. In FIG. 5, the F-tab 40 described with regard to FIG. 2 is distributed as the
F-tabs 70-1 through 70-6, each of which is coupled to one of the six groups of nine
interconnect conductors in the expressway routing channel M3. Further, as
described above, between the B2x2 tile 32 on the edge of a B16x16 tile 12 and an I/O
block 14 are disposed two freeway routing channels 20.

In the I/O block 14, input/output (I/O) pins 72 are coupled to input buffer 74
inputs and high impedance output buffer 76 outputs. The I/O pins 72 may carry clock,
set, enable or global signals. However, the pins nearest the center of the FPGA typically
have the lowest skew over the FPGA, and are therefore preferably reserved for the
highest performance clock signals. The output of each of the input buffers 74 is coupled
to the input of a boundary scan register (BSR) 78, and is also passed by the BSR 78 so
that it is coupled to a first input of a global multiplexer 80 associated with the GIO routing
channel 16.

The output of each of the input buffers 74 may also be programmably coupled to
a delay locked loop (DLL) reference line 80 or DLL feedback line 82 by reprogrammable
elements 84 depicted by open circles. Only one of the reprogrammable elements is
indicated by the reference numeral 84 to avoid overcomplicating the drawing figure.
The reprogrammable elements 84, with one exception to be mentioned below, are
preferably SRAM pass devices, although those of ordinary skill in the art will readily
appreciate that other types of programmable elements may also be suitably employed
according to the present invention.

The input of each of the high impedance output buffers 76 may be programmably
coupled to the DLL feedback line 82 by a reprogrammable element 84. The DLL
feedback line 82 may also be programmably coupled to interconnect conductors in the
BC routing channel 50 by programmable elements that are preferably tri-state buffers
86. The input of each of the high impedance output buffers 76 is also connected to one
of the distributed F-tabs 70-1 through 70-5, and may also be coupled by a
reprogrammable element to interconnect conductors in the LM routing channels 54 of
the B2x2 tile 32. The F-tab 70-6 may be programmably coupled to either the output of
each BSR 78 or to the enable input of each high impedance output buffer 76.

The output of each BSR 78 is also coupled to a second input of a separate global
multiplexer 80 and to a separate F-tab 70-1 through 70-5. The DLL clock output line 88
is coupled to a third input of each of the global multiplexers 80, and each of the F-tabs
70-1 through 70-5 is coupled to a fourth input of a separate global multiplexer 80.

The outputs of the global multiplexers 80 are coupled to tristatable buffers 90
whose outputs are coupled to interconnect conductors in the GIO routing channels 16.
Accordingly, from the above discussion it should be appreciated that at the output of
the tristatable buffers 90 a global signal may be provided to the FPGA architecture 10
from any of four locations, namely, an input buffer 74, the output of a BSR 78, the
output of a DLL on the DLL clock output line 88, and a signal in a B16x16 tile 12 via an
F-tab 70-1 through 70-5.
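
The selection among the four sources can be modeled behaviorally. The sketch below is only an illustration in Python under stated assumptions: the enumeration GlobalSource, its numeric select encoding, and the function global_mux are hypothetical, and the disclosure does not specify how the multiplexer select or the buffer enable is programmed. The input order follows the first through fourth inputs named above.

```python
from enum import IntEnum
from typing import Optional, Sequence


class GlobalSource(IntEnum):
    """Inputs of a global multiplexer 80 (select encoding is assumed)."""
    INPUT_BUFFER = 0  # first input: output of an input buffer 74
    BSR_OUTPUT = 1    # second input: output of a BSR 78
    DLL_CLOCK = 2     # third input: DLL clock output line 88
    F_TAB = 3         # fourth input: internal signal via an F-tab


def global_mux(select: GlobalSource,
               inputs: Sequence[int],
               enable: bool = True) -> Optional[int]:
    """Value driven onto a GIO conductor, or None when tristated.

    `enable` models the tristatable buffer 90 at the multiplexer output.
    """
    if len(inputs) != len(GlobalSource):
        raise ValueError("a global multiplexer has exactly four inputs")
    if not enable:
        return None  # buffer 90 in high impedance; conductor not driven
    return inputs[select]


if __name__ == "__main__":
    sources = [0, 1, 1, 0]  # example logic levels on the four inputs
    print(global_mux(GlobalSource.DLL_CLOCK, sources))
    print(global_mux(GlobalSource.F_TAB, sources, enable=False))
```

Returning None for a disabled buffer reflects that several tristatable buffers 90 may share one GIO conductor, so only an enabled buffer contributes a value.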

It should be observed that more than one tristatable buffer 90 may drive a single
interconnect conductor in the GIO routing channel 16. The number of tristatable buffers
90 driving a single interconnect conductor in the GIO routing channel 16 depends on
the number of B16x16 tiles 12 in the FPGA architecture 10. When the FPGA
architecture 10 is 1, 2, 3, or 4 B16x16 tiles 12 wide, then the number of tristatable buffers
90 preferably coupled to a single interconnect conductor in the GIO routing channel 16
is 2, 4, 6, or 8, respectively. As depicted in FIG. 1, each of the GIO routing channels 16
from both the top and bottom of the FPGA architecture 10 is fed into a 32-bit GSB 18
in each of the B16x16 tiles 12.
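
The buffer counts and bus width given above follow a simple pattern, sketched here in Python. The function names are hypothetical, and the linear rule of two buffers per tile of width is an assumption read off the four stated cases (1, 2, 3 and 4 tiles wide); the sketch therefore rejects widths outside that range.

```python
GIO_CONDUCTORS_PER_CHANNEL = 16  # preferred width of one GIO routing channel
GIO_CHANNELS_PER_TILE = 2        # one from the top and one from the bottom


def tristate_buffers_per_conductor(tiles_wide: int) -> int:
    """Tristatable buffers 90 coupled to one GIO conductor.

    Only widths of 1 to 4 B16x16 tiles are stated in the description.
    """
    if not 1 <= tiles_wide <= 4:
        raise ValueError("only widths of 1 to 4 tiles are described")
    return 2 * tiles_wide


def gsb_width() -> int:
    """Width of the GSB 18 formed from the top and bottom GIO channels."""
    return GIO_CHANNELS_PER_TILE * GIO_CONDUCTORS_PER_CHANNEL


if __name__ == "__main__":
    print([tristate_buffers_per_conductor(n) for n in range(1, 5)])
    print(gsb_width())
```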

Turning now to FIG. 6, the distribution of the global signals to one of four B8x8
tiles 36 in a B16x16 tile 12 by the GSB 18 is illustrated. It should be appreciated that the
GSB 18 illustrated in FIG. 6 provides signals to each of the four B8x8 tiles 36 in a
B16x16 tile 12 in a manner similar to the single B8x8 tile 36 that is illustrated. The B8x8
tile 36 includes four B4x4 tiles 34.

Disposed between the GSB 18 and each of the B8x8 tiles 36 is a G3 routing channel
100. Preferably each G3 routing channel 100 has twelve interconnect conductors. The
G3 routing channels 100 form intersections 102 with the expressway routing channels
M3. Disposed at selected ones of the intersections 102 are reprogrammable elements
that can couple interconnect conductors in the G3 routing channels 100 with the
interconnect conductors in the expressway routing channels M3 at the selected ones of
the intersections 102. The reprogrammable elements are preferably SRAM pass devices.

At the freeway routing channels 20, selected interconnect conductors in the G3
routing channels 100 are coupled to global tabs (G-tabs) 104. A G-tab 104 is an active
device that includes tri-state buffers and a matrix of reprogrammable switches. The
reprogrammable switches are preferably SRAM pass devices. The interconnect
conductors in the freeway routing channels 20 and the G3 routing channel 100 that are
fed into a G-tab 104 may be coupled by the reprogrammable switches to many of the
other interconnect conductors in the freeway routing channels 20 and the G3 routing
channel 100 that come into the G-tab 104. A description of the implementation of a
G-tab 104 is beyond the scope of this disclosure and will not be made herein to avoid
overcomplicating the disclosure and thereby obscuring the present invention. Further,
the interconnect conductors in the freeway routing channels 20 and the G3 routing
channel 100 that are fed into a G-tab 104 continue in the same direction through the
G-tab 104, even though the interconnect conductors are coupled to other interconnect
conductors by the reprogrammable switches.

For each B4x4 tile 34, signals on a G3 routing channel 100 and the GSB 18 are
coupled into the B4x4 tile 34 by a G3 switch matrix (G3MAT) 106 on a B4x4 utility
routing channel 108. In FIG. 7, a G3MAT 106 is depicted in greater detail. The GSB 18
is preferably partitioned into sixteen GENERAL PURPOSE/ENABLE interconnect
conductors 110, eight SET interconnect conductors 112, and eight CLOCK
interconnect conductors 114. A G3 routing channel 100 is also illustrated. In the
G3MAT 106, the interconnect conductors in the G3 routing channel 100 and the GSB 18 form
intersections with the B4x4 utility routing channel 108. The interconnect conductors in
the B4x4 utility routing channel 108 are buffered by tri-state buffers 116. Disposed at
selected intersections are reprogrammable elements 118 depicted as open circles. To
avoid overcomplicating the drawing figure, only a single tri-state buffer is indicated by
the reference numeral 116, and only a single reprogrammable element is indicated by
the reference numeral 118.

The G3 routing channels 100 and GSB 18 are also coupled to user SRAM
modules 120 by the SRAM switching block 122. As illustrated in FIG. 8, in the SRAM
switching block 122, the eight CLOCK interconnect conductors 114 of the GSB 18 can
be coupled to the write and read clock (WRCK, RCK) inputs of the user SRAM blocks
120, the sixteen GENERAL PURPOSE/ENABLE interconnect conductors 110 can be
coupled to the write-enable and read-enable (WEN, REN) inputs of the user SRAM
blocks 120, and the G3 routing channels 100 can be coupled to the WRCK/RCK inputs
and the WEN/REN inputs of the user SRAM blocks 120.
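
The partition of the GSB 18 and the couplings available in the SRAM switching block 122 can be summarized in a small lookup, sketched below. The dictionary and function names are hypothetical; the conductor counts and the permitted couplings are taken from the description above.

```python
# Partition of the GSB as stated in the text (32 conductors in total).
GSB_GROUPS = {
    "GENERAL PURPOSE/ENABLE": 16,
    "SET": 8,
    "CLOCK": 8,
}

# Sources that the SRAM switching block can couple to each user SRAM input.
# "G3" stands for the G3 routing channels.
SRAM_INPUT_SOURCES = {
    "WRCK": {"CLOCK", "G3"},
    "RCK": {"CLOCK", "G3"},
    "WEN": {"GENERAL PURPOSE/ENABLE", "G3"},
    "REN": {"GENERAL PURPOSE/ENABLE", "G3"},
}


def can_drive(source: str, sram_input: str) -> bool:
    """True if the text describes a coupling from source to sram_input."""
    return source in SRAM_INPUT_SOURCES[sram_input]


gsb_width = sum(GSB_GROUPS.values())
```

Note that the SET conductors are not described as reaching any user SRAM input, so can_drive("SET", ...) is false for every input in this sketch.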

FIG. 9 illustrates a B4x4 utility routing channel 108 from a G3MAT 106 coupled
into a B1 block 30 according to the present invention. The connections of the B4x4
utility routing channel 108 to the remaining B1 blocks 30 in the B4x4 tile 34 are similar
to those depicted.

Each B1 block 30 includes four clusters 130-1 through 130-4 of devices. Within
the B1 block 30, the BC routing channels 50 and the LM routing channels 54 have
been omitted to avoid overcomplicating the drawing figure. Each of the four clusters
130-1 through 130-4 includes first and second LUT3s 132-1 and 132-2, respectively, a
LUT2 134, and a DFF 136. Each of the LUT3s 132 has first, second, and third inputs
indicated as "A", "B", and "C", and a single output indicated as "Y". Each of the
LUT2s 134 has first and second inputs indicated as "A" and "B", and a single output
indicated as "Y". With a LUT3 132, any three-input boolean logic function may be
implemented, and with a LUT2 134, any two-input boolean logic function may be
implemented.
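
The statement that a LUT3 can implement any three-input boolean function follows from storing one output bit per input combination: eight bits for a LUT3 (256 possible functions) and four bits for a LUT2 (16 possible functions). A minimal sketch follows; the function name make_lut and the bit ordering (first input is the least significant index bit) are assumptions of the sketch, not taken from the disclosure.

```python
def make_lut(truth_table: int, n_inputs: int):
    """Return a function that evaluates an n-input look-up table.

    Bit i of truth_table is the output for the input combination whose
    binary encoding is i, with the first input as the least significant bit.
    """
    if not 0 <= truth_table < (1 << (1 << n_inputs)):
        raise ValueError("truth table does not fit the LUT size")

    def lut(*inputs: int) -> int:
        if len(inputs) != n_inputs:
            raise ValueError(f"expected {n_inputs} inputs")
        index = sum((bit & 1) << pos for pos, bit in enumerate(inputs))
        return (truth_table >> index) & 1

    return lut


# 3-input majority: output is 1 for indices 3, 5, 6 and 7.
majority = make_lut(0b11101000, 3)
# 2-input exclusive OR: output is 1 for indices 1 and 2.
xor2 = make_lut(0b0110, 2)
```

Any other function is obtained by changing only the stored bits, which is what makes the element reprogrammable.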

Each DFF 136 has a data input indicated as "D", an enable input indicated as
"E", a clock input, a preset input, and a data output indicated as "Q". The DFF 136
may also be configured as a latch. In each of the clusters 130-1 through 130-4, the
outputs "Y" of the LUT3s 132-1 and 132-2 are multiplexed to the input of DFF 136,
and further multiplexed with the output "Q" of the DFF 136 to form first and second
outputs of each of the clusters 130-1 through 130-4.

The B4x4 utility routing channel 108 from a G3MAT 106 is coupled to the inputs
of four 8-input multiplexers 138, 140, 142, and 144 that select from the GENERAL
PURPOSE/ENABLE signals 110, the CLOCK signals 114, and the SET signals 112,
respectively. As such, the multiplexers 138, 140, 142, and 144 are labelled G, E, C, and S.
It should be noted that for the remaining B1 blocks 30 in the B2x2 tile 32, the
multiplexers 142 and 144 may be employed, but multiplexers similar to the multiplexers
138 and 140 must be provided for each of the remaining B1 blocks 30 in the B2x2 tile
32.

The outputs of multiplexers 138, 140, 142, and 144 are coupled to tri-state buffers
146, and the outputs of the tri-state buffers 146 are coupled into each of the four clusters
130-1 through 130-4 on the signal lines 148, conveniently depicted as G, E, C, and S to
correspond to the multiplexers 138, 140, 142, and 144 to which they are coupled.
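
The selection path from the utility routing channel to the four signal lines can be sketched as an 8-input multiplexer followed by a tri-state buffer. The function names are hypothetical, and high impedance is represented by None. How the eight inputs of each multiplexer are drawn from the utility conductors is not specified here, so the sketch simply takes eight values per multiplexer.

```python
from typing import Optional, Sequence


def mux8(inputs: Sequence[int], select: int) -> int:
    """8-input multiplexer: pass the input chosen by select (0..7)."""
    if len(inputs) != 8:
        raise ValueError("an 8-input multiplexer needs exactly 8 inputs")
    if not 0 <= select < 8:
        raise ValueError("select must be in 0..7")
    return inputs[select]


def tristate(value: int, enable: bool) -> Optional[int]:
    """Tri-state buffer: drive value when enabled, else high impedance."""
    return value if enable else None


def drive_signal_lines(utility: dict, selects: dict, enables: dict) -> dict:
    """Compute the G, E, C and S signal lines.

    All three arguments are dicts keyed by "G", "E", "C" and "S".
    """
    return {
        name: tristate(mux8(utility[name], selects[name]), enables[name])
        for name in ("G", "E", "C", "S")
    }
```

For example, enabling the G, E, and C buffers while leaving the S buffer disabled yields driven values on three lines and None on the S line.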

In each of the clusters 130-1 through 130-4, the output of the multiplexer 138 on
signal line G is coupled to the A input of LUT3 132-1 and to the B input of LUT3 132-2.
The output of the multiplexer 140 on signal line E is coupled to the B input of LUT3
132-1 and to the A input of LUT3 132-2. The output of the multiplexer 142 on signal
line C is coupled to a multiplexer whose output is coupled to the C input of the LUT3
132-1 in clusters 130-1 and 130-2 and the C input of the LUT3 132-2 in clusters 130-3
and 130-4. The output of the multiplexer 144 on signal line S is coupled to a multiplexer
whose output is coupled to the C input of the LUT3 132-1 in clusters 130-3 and 130-4
and the C input of the LUT3 132-2 in clusters 130-1 and 130-2. Finally, the outputs of
multiplexers 140, 142, and 144 (multiplexers E, C, and S) are coupled to the enable,
preset, and clock inputs of the DFF 136 in each of the clusters.
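
The assignment of signal lines to LUT3 inputs described above can be tabulated by a short function, sketched here under one stated assumption: because a LUT3 has only the inputs "A", "B", and "C", both the C line and the S line are modeled as reaching the "C" input (through the intervening multiplexer, which is not modeled). The function name is hypothetical.

```python
def lut3_input_sources(cluster: int, lut: int) -> dict:
    """Signal line feeding each input of LUT3 132-<lut> in cluster 130-<cluster>.

    cluster is 1..4 and lut is 1..2, following the reference numerals.
    """
    if cluster not in (1, 2, 3, 4) or lut not in (1, 2):
        raise ValueError("cluster must be 1..4 and lut must be 1..2")
    # G and E swap between the A and B inputs of the two LUT3s.
    a, b = ("G", "E") if lut == 1 else ("E", "G")
    # The C line serves LUT3 132-1 in clusters 130-1 and 130-2 and
    # LUT3 132-2 in clusters 130-3 and 130-4; the S line serves the rest.
    in_first_pair = cluster in (1, 2)
    c = "C" if (lut == 1) == in_first_pair else "S"
    return {"A": a, "B": b, "C": c}
```

For instance, lut3_input_sources(1, 1) gives G, E, and C on inputs A, B, and C, while lut3_input_sources(2, 2) gives E, G, and S.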

Each of the DC interconnect conductors 56-1 and 56-2 is multiplexed in a serial
fashion with the C and S lines to the "C" inputs of LUT3s in each cluster 130-1 through
130-4. For example, in the serial connection, the DC interconnect conductor 56-1 is
multiplexed with the C line to the "C" input of the LUT3 132-1 of the cluster 130-1.
Next, the "Y" output of the LUT3 132-1 in cluster 130-1 is multiplexed with the S line
to the "C" input of the LUT3 132-2 in cluster 130-2. Next, the "Y" output of the
LUT3 132-2 in cluster 130-2 is multiplexed with the S line to the "C" input of the LUT3
132-1 in cluster 130-3. Next, the "Y" output of the LUT3 132-1 in cluster 130-3 is
multiplexed with the C line to the "C" input of the LUT3 132-2 in cluster 130-4.
Finally, the "Y" output of the LUT3 132- in cluster 130-4 passes out of the B1 block 30,
and is multiplexed to the "C" input of the LUT3 132-1 in cluster 130-1 of the B1 block
30 disposed vertically below. The DC interconnect conductor 56-2 is similarly
connected.
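
The serial connection of the DC interconnect conductor 56-1 can be sketched as a four-stage chain in which each stage's multiplexer picks either the value carried from the previous stage or the C or S line. The names CHAIN and propagate are hypothetical, and each LUT3 is reduced to a function of its "C" input alone, with its A and B inputs assumed to be held fixed.

```python
# (cluster, LUT3, competing line) for each stage, in the order given in the text.
CHAIN = [(1, 1, "C"), (2, 2, "S"), (3, 1, "S"), (4, 2, "C")]


def propagate(dc_in, lines, use_chain, luts):
    """Walk the serial chain through one B1 block.

    dc_in:     value arriving on the DC interconnect conductor.
    lines:     dict with the current values of the "C" and "S" lines.
    use_chain: four booleans; True selects the carried value at that stage,
               False selects the stage's C or S line instead.
    luts:      four callables f(c) -> y, one per stage.
    Returns the "Y" output of the last stage, which leaves the block.
    """
    carried = dc_in
    for (_cluster, _lut, line), chained, f in zip(CHAIN, use_chain, luts):
        c_input = carried if chained else lines[line]
        carried = f(c_input)
    return carried


def identity(c):
    return c


def invert(c):
    return 1 - c
```

With every stage selecting the chain and every LUT3 passing its "C" input through, the value on the DC conductor emerges unchanged at the bottom of the block, ready to enter the block disposed vertically below.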

While embodiments and applications of this invention have been shown and
described, it would be apparent to those skilled in the art that many more modifications
than mentioned above are possible without departing from the inventive concepts
herein. The invention, therefore, is not to be restricted except in the spirit of the
appended claims.

What is Claimed is:

1. A global signal distribution architecture for an FPGA architecture 
comprising: 

a plurality of multiplexers, each of said plurality of multiplexers having a 
plurality of inputs and an output;

a plurality of global I/O lines, each of said plurality of global I/O lines 
coupled to a separate output of one of said plurality of multiplexers; 

a global signal distribution bus coupled to said plurality of global I/O lines; 
said global signal distribution bus spanning a highest level in the FPGA architecture; 

a plurality of switch matrices, each of said switch matrices coupled to said 
global signal distribution bus to a plurality of utility conductors; and 

at least one multiplexer associated with a lowest level in the FPGA 
architecture coupled to said plurality of utility conductors. 